feat: add E2E test framework for Electron applications#118
feat: add E2E test framework for Electron applications#118l17728 wants to merge 1 commit intohellodigua:mainfrom
Conversation
l17728
commented
Apr 2, 2026
- Add TEST_MODE environment variable to bypass single-instance lock
- Create app-launcher helper for spawning Electron processes with debugging
- Enable parallel E2E test execution for improved CI/CD efficiency
- Provides foundation for cross-platform test automation
- 后面的所有提交都会使用本测试框架进行自动化测试和回归,先提交自动测试框架的PR
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 258d5fac56
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
tests/e2e/helpers/app-launcher.js
Outdated
| }, 100) | ||
| }).then((port) => { | ||
| if (port) return port | ||
| return findAvailablePort(port + 1) |
There was a problem hiding this comment.
Continue port scan from the requested range
When the first candidate port is unavailable, this branch recurses with port + 1, but port here is the .then(...) argument and is null on failure, so the next scan starts at 1 instead of startPort + 1. In parallel E2E runs this can probe privileged/irrelevant ports and introduce long retry delays or launch failures before reaching a usable port.
Useful? React with 👍 / 👎.
tests/e2e/helpers/app-launcher.js
Outdated
| env: { | ||
| ...process.env, | ||
| TEST_MODE: 'true', // E2E 测试模式:允许多个实例 | ||
| REMOTE_DEBUGGING_PORT: port, |
There was a problem hiding this comment.
Pass CDP port via Electron CLI switch
Setting REMOTE_DEBUGGING_PORT in the environment does not configure Electron's remote debugging endpoint; Electron expects the --remote-debugging-port=<port> launch argument. Because the switch is never added to the spawned command, the app may start without an exposed CDP port, causing E2E clients to fail when attaching to the expected debugger endpoint.
Useful? React with 👍 / 👎.
258d5fa to
97d2197
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97d2197827
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
tests/e2e/helpers/app-launcher.js
Outdated
| cleanup() | ||
| if (!completed) { | ||
| resolve(null) |
There was a problem hiding this comment.
Resolve timed-out port probes before flipping completion
In findAvailablePort, the timeout handler calls cleanup() before deciding whether to resolve(null). Because cleanup() sets completed = true, the subsequent if (!completed) branch is always false, so a probe that times out never resolves and later listen callbacks are also ignored (completed is already true). Under slow/loaded CI where the listen callback is delayed past 100ms, launchApp() can hang indefinitely while searching for a port.
Useful? React with 👍 / 👎.
tests/e2e/helpers/app-launcher.js
Outdated
| if (!proc.killed) { | ||
| proc.kill('SIGKILL') |
There was a problem hiding this comment.
Use liveness check instead of
proc.killed for force-kill
The force-kill fallback is gated by if (!proc.killed), but proc.killed becomes true immediately after sending the initial SIGTERM and does not mean the process has exited. This means the SIGKILL path is effectively disabled for stubborn Electron processes, so close() may resolve while the app is still running and leak processes across E2E runs.
Useful? React with 👍 / 👎.
97d2197 to
70e3d78
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 70e3d78705
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
tests/e2e/helpers/app-launcher.js
Outdated
| proc.on('exit', (code) => { | ||
| if (code !== null && code !== 0) { | ||
| console.error(`[AppLauncher] Electron 进程异常退出,退出码: ${code}`) | ||
| } |
There was a problem hiding this comment.
Treat non-zero early exits as launch failures
launchApp only records spawn-level failures via the 'error' event, but a child that starts and then exits with a non-zero code during startup is merely logged and still returned as a “launched” app. In that scenario, E2E runs proceed with a dead process and fail later when attaching to CDP, making failures flaky and harder to diagnose; this exit path should set a launch failure signal (or re-check proc.exitCode after the startup wait) and reject immediately.
Useful? React with 👍 / 👎.
tests/e2e/helpers/app-launcher.js
Outdated
| if (proc.killed) { | ||
| return |
There was a problem hiding this comment.
Short-circuit close when child has already exited
The early-return guard uses proc.killed, which remains false when a process exits on its own. If Electron crashes/exits before close() is called, this path still sets up a new exit listener (after the event already fired) and then waits for the 5s force-kill timer before resolving, adding unnecessary teardown delays and slowing failing test runs.
Useful? React with 👍 / 👎.
70e3d78 to
b81ebf8
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b81ebf8218
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
tests/e2e/helpers/app-launcher.js
Outdated
|
|
||
| // 检查进程是否已经退出(自行退出或被杀死) | ||
| // proc.killed 只在我们主动 kill 时为 true,不包括自行退出的情况 | ||
| if (proc.exitCode !== null || proc.signalDescription) { |
There was a problem hiding this comment.
Use signalCode in exited-process guard
Fresh evidence in this revision: the close() early-return guard now checks proc.signalDescription, but Node ChildProcess exposes signalCode instead, so processes that already exited via signal still fall through to the teardown path. In that case the new exit listener never fires (event already emitted), and close() waits the full 5-second force-kill timeout on every affected test, slowing and adding flakiness to failing E2E runs.
Useful? React with 👍 / 👎.
b81ebf8 to
ec76222
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ec7622211d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| stdio: 'inherit', | ||
| env: { | ||
| ...process.env, | ||
| TEST_MODE: 'true', // E2E 测试模式:允许多个实例 |
There was a problem hiding this comment.
Isolate test instances with separate user data dirs
This launcher always sets TEST_MODE=true to bypass the single-instance lock, but it does not provide a per-instance userData override, so parallel E2E runs will share the same on-disk profile and databases. Because the main process still reads/writes default app.getPath('userData') storage paths, concurrent test processes can interfere with each other (state leakage, lock contention, flaky assertions) and may mutate a developer’s real local profile when running tests outside CI.
Useful? React with 👍 / 👎.
tests/e2e/helpers/app-launcher.js
Outdated
| proc.on('exit', (code) => { | ||
| exitCode = code |
There was a problem hiding this comment.
Treat signal-terminated startup exits as launch failures
The startup exit handler only records the numeric code, but Node reports signal-based termination as code === null with a non-null signal. In that case the post-wait check (exitCode !== null && exitCode !== 0) is bypassed and launchApp() can return success even though Electron has already died, causing downstream CDP attach failures that are harder to diagnose.
Useful? React with 👍 / 👎.
ec76222 to
73facd7
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 73facd79e3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // 这防止了并发进程的状态泄漏、死锁和数据库冲突 | ||
| const e2eUserDataDir = process.env.CHATLAB_E2E_USER_DATA_DIR | ||
| if (e2eUserDataDir) { | ||
| app.setPath('userData', e2eUserDataDir) |
There was a problem hiding this comment.
Skip legacy data migration in E2E userData mode
Setting app.setPath('userData', e2eUserDataDir) here reroutes all startup data operations to the test directory, but init() still runs migrateDataIfNeeded(). In this repository, migrateFromLegacyDir() removes Documents/ChatLab after copying, so running E2E locally on a machine that still has legacy data can migrate/delete real user data into a disposable E2E path. Please gate legacy migration off in TEST_MODE (or when CHATLAB_E2E_USER_DATA_DIR is set) to avoid destructive side effects outside CI.
Useful? React with 👍 / 👎.
tests/e2e/helpers/app-launcher.js
Outdated
| server.close(() => { | ||
| resolve(port) | ||
| }) |
There was a problem hiding this comment.
Reserve the CDP port instead of releasing before spawn
This returns the candidate port only after closing the probe socket, so the port is no longer reserved during the gap before Electron binds --remote-debugging-port. In parallel E2E starts, two workers can both observe the same port as free and then race to claim it, causing intermittent launch/attach failures. The launcher should avoid this TOCTTOU window (for example by deterministic per-worker ports or by holding a reservation until process startup).
Useful? React with 👍 / 👎.
…rt management, instance isolation, and safety guards
## 功能
添加 Electron 应用的 E2E 测试启动框架,支持并行实例运行、CDP 调试、独立数据隔离、完善的故障检测和数据保护。
### 核心特性
#### 1. 可靠的端口扫描和预留机制
- 自动查找可用的 TCP 端口(默认从 9222 开始)
- **新增**:保持端口预留直到进程启动,防止 TOCTTOU 竞态
- 修复:递归逻辑中的 null + 1 bug
- 添加最大重试限制(100 次),防止无限递归
- 100ms 超时保护,快速检测端口占用
- 并发安全的 completed 标志和资源清理
#### 2. 正确的 CDP 端口配置
- 使用 --remote-debugging-port=<port> 命令行参数
- 移除无效的 REMOTE_DEBUGGING_PORT 环境变量
- 确保 E2E 客户端能正确连接调试器
#### 3. 完善的启动检测和错误处理
- 验证应用目录和 Electron 可执行文件存在性
- 捕获 spawn 错误事件并提供清晰错误信息
- 检测启动期间的早期退出(非零退出码)
- 检测信号终止的启动失败(SIGKILL、SIGTERM 等)
- 启动后立即抛出错误而不仅记录日志
- 避免连接到死进程,失败诊断清晰
#### 4. 并行实例隔离和数据保护
- 为每个 E2E 实例创建独立的用户数据目录
- 基于端口号生成唯一路径:`/tmp/chatlab-e2e-{port}`
- 通过环境变量 CHATLAB_E2E_USER_DATA_DIR 传递给主进程
- 主进程自动调用 app.setPath('userData', dir) 隔离存储
- **新增**:E2E 模式下跳过遗留数据迁移,保护用户真实数据
- 防止并发进程的状态泄漏、死锁、数据库冲突
#### 5. 优雅且高效的进程管理
- 防止多次调用 close() 导致的事件监听器泄漏
- 检查 proc.exitCode 和 proc.signalCode
- 已退出进程立即返回(无 5s 延迟)
- 使用 SIGTERM 允许进程正常清理资源
- 5 秒超时后强制 SIGKILL 防止僵尸进程
- 使用活力检查(proc.kill(0))替代 proc.killed
- 使用正确的 Node.js API signalCode(不是 signalDescription)
- 强制 SIGKILL 路径对 stubborn 进程有效
- 进程退出时立即清除超时定时器
#### 6. 并行实例支持
- 在主进程添加 TEST_MODE 环境变量检查
- 绕过单实例锁允许多个 Electron 实例
- 跳过遗留迁移保护用户数据
- 每个实例自动分配不同的 CDP 端口和数据目录
### 文件变更
- `electron/main/index.ts`:
* 添加 TEST_MODE 检查,绕过单实例锁
* 添加 CHATLAB_E2E_USER_DATA_DIR 读取和隔离
* **新增**:跳过 E2E 模式下的遗留迁移
- `tests/e2e/helpers/app-launcher.js`: 完整的应用启动管理模块(260+ 行)
* **新增**:port 预留机制,防止 TOCTTOU 竞态
### 使用示例
```javascript
const { launchApp } = require('./tests/e2e/helpers/app-launcher')
// 启动应用(自动查找可用端口和独立数据目录)
const app = await launchApp()
console.log('CDP 端口:', app.port)
// 运行测试...
// 关闭应用(已退出进程快速返回)
await app.close()
```
### 可配置选项
```javascript
await launchApp({
port: 9222, // 指定端口,默认自动查找(带预留)
userDataDir: '/custom/path', // 自定义用户数据目录
startupWaitTime: 2000 // 启动等待时间(毫秒),默认 2000
})
```
### 并行测试场景
✅ 多个实例同时运行(每个实例独立端口和数据目录)
✅ 自动端口分配(9222, 9223, 9224...)
✅ **端口预留防止竞态**(同时启动无冲突)
✅ 自动数据隔离(/tmp/chatlab-e2e-9222, /tmp/chatlab-e2e-9223...)
✅ 资源正确清理(无泄漏)
✅ 启动失败快速失败(清晰诊断)
✅ 进程强制清理(防止僵尸进程)
✅ 慢速 CI 环境中不挂起
✅ Stubborn 进程能被正确杀死
✅ 已退出进程快速检测和返回
✅ 信号退出的进程也能快速返回
✅ 无共享状态,支持真正的并行测试
✅ 用户数据受保护,不会被测试污染
### 修复的关键问题
**P1 问题:**
- 端口扫描超时导致的无限挂起
- 进程强制杀死被 proc.killed 误导的竞态条件
- 启动期间的早期退出未被检测
- 并行 E2E 实例共享 userData 导致冲突
- **新增**:遗留迁移在 E2E 模式下可能删除用户数据
- **新增**:端口分配 TOCTTOU 竞态导致启动冲突
**P2 问题:**
- 慢速 CI 中的 listen 回调延迟
- close() 对已退出进程的 5 秒不必要延迟
- signalCode API 错误
- 信号终止的启动失败未被检测
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
73facd7 to
dc99853
Compare